A Bad Instance for k-Means++

Authors

  • Tobias Brunsch
  • Heiko Röglin
Abstract

k-means++ is a seeding technique for the k-means method with an expected approximation ratio of O(log k), where k denotes the number of clusters. Examples are known on which the expected approximation ratio of k-means++ is Ω(log k), showing that the upper bound is asymptotically tight. However, it remained open whether k-means++ yields an O(1)-approximation with probability 1/poly(k) or even with constant probability. We settle this question and present instances on which k-means++ achieves an approximation ratio of (2/3−ε) · log k only with exponentially small probability.

Similar resources

Identification of Bad Signatures in Batches

The paper addresses the problem of bad signature identification in batch verification of digital signatures. The number of generic tests necessary to identify all bad signatures in a batch instance is used to measure the efficiency of verifiers. The divide-and-conquer verifier DCV(x; n) is defined. The verifier identifies all bad signatures in a batch instance x of the length n by repeatedly splitting...

Identification of Bad Signatures in Batches

The paper addresses the problem of bad signature identification in batch verification of digital signatures. The number of generic tests necessary to identify all bad signatures in a batch instance, is used to measure the efficiency of verifiers. The divide-and-conquer verifier DCVα(x,n) is defined. The verifier identifies all bad signatures in a batch instance x of the length n by repeatedly s...

The global Minmax k-means algorithm

The global k-means algorithm is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure from suitable initial positions, and employs k-means to minimize the sum of the intra-cluster variances. However, the global k-means algorithm sometimes results in singleton clusters, and the initial positions are sometimes bad, afte...

A bad 2-dimensional instance for k-means++

The k-means++ seeding algorithm is one of the most popular algorithms for finding the initial k centers when using the k-means heuristic. The algorithm is a simple sampling procedure and can be described as follows: Pick the first center uniformly at random from among the given points. For i > 1, pick a point to be the i-th center with probability proportional to the square of the Euclidean dist...
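
The sampling procedure described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: it assumes the standard rule that each new center is drawn with probability proportional to the squared Euclidean distance to the nearest center chosen so far. The names `sq_dist`, `kmeanspp_seed`, and `kmeans_cost` are hypothetical, and the uniform fallback for the all-zero-weight case is an added assumption to keep the sketch runnable.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers from `points` by D^2 sampling (k-means++ seeding)."""
    rng = rng or random.Random()
    # First center: uniformly at random among the given points.
    centers = [rng.choice(points)]
    # d2[j] = squared distance from points[j] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Assumption: every point coincides with a chosen center,
            # so fall back to a uniform choice instead of failing.
            nxt = rng.choice(points)
        else:
            # Sample the next center with probability proportional to d2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Update nearest-center distances with the new center.
        d2 = [min(old, sq_dist(p, nxt)) for p, old in zip(points, d2)]
    return centers


def kmeans_cost(points, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)
```

Dividing `kmeans_cost` for the sampled centers by the cost of an optimal set of centers gives the approximation ratio that lower-bound instances such as the ones discussed here are designed to make large.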

A sharp threshold for a random constraint satisfaction problem

We consider random instances I of a constraint satisfaction problem generalizing k-SAT: given n boolean variables, m ordered k-tuples of literals, and q “bad” clause assignments, find an assignment which does not set any of the k-tuples to a bad clause assignment. We consider the case where k = Ω(log n), and generate instance I by including every k-tuple of literals independently with probabili...

Journal:
  • Theor. Comput. Sci.

Volume 505

Publication date: 2011